NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

STARK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases

Wu, Shirley; Zhao, Shiyu; Yasunaga, Michihiro; Huang, Kexin; Cao, Kaidi; Huang, Qian; Ioannidis, Vassilis N; Subbian, Karthik; Zou, James; Leskovec, Jure (December 2024, Advances in neural information processing systems)

Full Text Available
Avatar: Optimizing llm agents for tool usage via contrastive reasoning

Wu, Shirley; Zhao, Shiyu; Huang, Qian; Huang, Kexin; Yasunaga, Michihiro; Cao, Kaidi; Ioannidis, Vassilis N; Subbian, Karthik; Leskovec, Jure; Zou, James (December 2024, Advances in neural information processing systems)

Full Text Available
A foundation model for clinician-centered drug repurposing

https://doi.org/10.1038/s41591-024-03233-x

Huang, Kexin; Chandak, Payal; Wang, Qianwen; Havaldar, Shreyas; Vaid, Akhil; Leskovec, Jure; Nadkarni, Girish N; Glicksberg, Benjamin S; Gehlenborg, Nils; Zitnik, Marinka (December 2024, Nature Medicine)

Full Text Available
RELBENCH: A Benchmark for Deep Learning on Relational Databases

Robinson, Joshua; Ranjan, Rishabh; Hu, Weihua; Huang, Kexin; Han, Jiaqi; Dobles, Alejandro; Fey, Matthias; Lenssen, Jan E; Yuan, Yiwen; Zhang, Zecheng; et al (December 2024, Advances in neural information processing systems)

Full Text Available
Uncertainty Quantification over Graph with Conformalized Graph Neural Networks

Huang, Kexin; Jin, Ying; Candes, Emmanuel; Leskovec, Jure (December 2023, Advances in neural information processing systems)

Graph Neural Networks (GNNs) are powerful machine learning prediction models on graph-structured data. However, GNNs lack rigorous uncertainty estimates, limiting their reliable deployment in settings where the cost of errors is significant. We propose conformalized GNN (CF-GNN), extending conformal prediction (CP) to graph-based models for guaranteed uncertainty estimates. Given an entity in the graph, CF-GNN produces a prediction set/interval that provably contains the true label with pre-defined coverage probability (e.g. 90%). We establish a permutation invariance condition that enables the validity of CP on graph data and provide an exact characterization of the test-time coverage. Besides valid coverage, it is crucial to reduce the prediction set size/interval length for practical use. We observe a key connection between non-conformity scores and network structures, which motivates us to develop a topology-aware output correction model that learns to update the prediction and produces more efficient prediction sets/intervals. Extensive experiments show that CF-GNN achieves any pre-defined target marginal coverage while significantly reducing the prediction set/interval size by up to 74% over the baselines. It also empirically achieves satisfactory conditional coverage over various raw and network features.
more » « less
Full Text Available
High dimensional, tabular deep learning with an auxiliary knowledge graph

Ruiz, Camilo; Ren, Hongyu; Huang, Kexin; Leskovec, Jure (December 2023, Advances in neural information processing systems)

Machine learning models exhibit strong performance on datasets with abundant labeled samples. However, for tabular datasets with extremely high d-dimensional features but limited n samples (i.e. d ≫ n), machine learning models struggle to achieve strong performance due to the risk of overfitting. Here, our key insight is that there is often abundant, auxiliary domain information describing input features which can be structured as a heterogeneous knowledge graph (KG). We propose PLATO, a method that achieves strong performance on tabular data with d ≫ n by using an auxiliary KG describing input features to regularize a multilayer perceptron (MLP). In PLATO, each input feature corresponds to a node in the auxiliary KG. In the MLP’s first layer, each input feature also corresponds to a weight vector. PLATO is based on the inductive bias that two input features corresponding to similar nodes in the auxiliary KG should have similar weight vectors in the MLP’s first layer. PLATO captures this inductive bias by inferring the weight vector for each input feature from its corresponding node in the KG via a trainable message-passing function. Across 6 d ≫ n datasets, PLATO outperforms 13 state-of-the-art baselines by up to 10.19%.
more » « less
Full Text Available
Predicting transcriptional outcomes of novel multigene perturbations with GEARS

https://doi.org/10.1038/s41587-023-01905-6

Roohani, Yusuf; Huang, Kexin; Leskovec, Jure (August 2023, Nature Biotechnology)

Abstract Understanding cellular responses to genetic perturbation is central to numerous biomedical applications, from identifying genetic interactions involved in cancer to developing methods for regenerative medicine. However, the combinatorial explosion in the number of possible multigene perturbations severely limits experimental interrogation. Here, we present graph-enhanced gene activation and repression simulator (GEARS), a method that integrates deep learning with a knowledge graph of gene–gene relationships to predict transcriptional responses to both single and multigene perturbations using single-cell RNA-sequencing data from perturbational screens. GEARS is able to predict outcomes of perturbing combinations consisting of genes that were never experimentally perturbed. GEARS exhibited 40% higher precision than existing approaches in predicting four distinct genetic interaction subtypes in a combinatorial perturbation screen and identified the strongest interactions twice as well as prior approaches. Overall, GEARS can predict phenotypically distinct effects of multigene perturbations and thus guide the design of perturbational experiments.
more » « less
Full Text Available
DRMref: comprehensive reference map of drug resistance mechanisms in human cancer

https://doi.org/10.1093/nar/gkad1087

Liu, Xiaona; Yi, Jiahao; Li, Tina; Wen, Jianguo; Huang, Kexin; Liu, Jiajia; Wang, Grant; Kim, Pora; Song, Qianqian; Zhou, Xiaobo (November 2023, Nucleic Acids Research)

Abstract Drug resistance poses a significant challenge in cancer treatment. Despite the initial effectiveness of therapies such as chemotherapy, targeted therapy and immunotherapy, many patients eventually develop resistance. To gain deep insights into the underlying mechanisms, single-cell profiling has been performed to interrogate drug resistance at cell level. Herein, we have built the DRMref database (https://ccsm.uth.edu/DRMref/) to provide comprehensive characterization of drug resistance using single-cell data from drug treatment settings. The current version of DRMref includes 42 single-cell datasets from 30 studies, covering 382 samples, 13 major cancer types, 26 cancer subtypes, 35 treatment regimens and 42 drugs. All datasets in DRMref are browsable and searchable, with detailed annotations provided. Meanwhile, DRMref includes analyses of cellular composition, intratumoral heterogeneity, epithelial–mesenchymal transition, cell–cell interaction and differentially expressed genes in resistant cells. Notably, DRMref investigates the drug resistance mechanisms (e.g. Aberration of Drug’s Therapeutic Target, Drug Inactivation by Structure Modification, etc.) in resistant cells. Additional enrichment analysis of hallmark/KEGG (Kyoto Encyclopedia of Genes and Genomes)/GO (Gene Ontology) pathways, as well as the identification of microRNA, motif and transcription factors involved in resistant cells, is provided in DRMref for user’s exploration. Overall, DRMref serves as a unique single-cell-based resource for studying drug resistance, drug combination therapy and discovering novel drug targets.
more » « less
Artificial intelligence foundation for therapeutic science

https://doi.org/10.1038/s41589-022-01131-2

Huang, Kexin; Fu, Tianfan; Gao, Wenhao; Zhao, Yue; Roohani, Yusuf; Leskovec, Jure; Coley, Connor W.; Xiao, Cao; Sun, Jimeng; Zitnik, Marinka (October 2022, Nature Chemical Biology)

Full Text Available
SkipGNN: predicting molecular interactions with skip-graph networks

https://doi.org/10.1038/s41598-020-77766-9

Huang, Kexin; Xiao, Cao; Glass, Lucas M.; Zitnik, Marinka; Sun, Jimeng (December 2020, Scientific Reports)
null (Ed.)
Abstract Molecular interaction networks are powerful resources for molecular discovery. They are increasingly used with machine learning methods to predict biologically meaningful interactions. While deep learning on graphs has dramatically advanced the prediction prowess, current graph neural network (GNN) methods are mainly optimized for prediction on the basis of direct similarity between interacting nodes. In biological networks, however, similarity between nodes that do not directly interact has proved incredibly useful in the last decade across a variety of interaction networks. Here, we present SkipGNN, a graph neural network approach for the prediction of molecular interactions. SkipGNN predicts molecular interactions by not only aggregating information from direct interactions but also from second-order interactions, which we call skip similarity. In contrast to existing GNNs, SkipGNN receives neural messages from two-hop neighbors as well as immediate neighbors in the interaction network and non-linearly transforms the messages to obtain useful information for prediction. To inject skip similarity into a GNN, we construct a modified version of the original network, called the skip graph. We then develop an iterative fusion scheme that optimizes a GNN using both the skip graph and the original graph. Experiments on four interaction networks, including drug–drug, drug–target, protein–protein, and gene–disease interactions, show that SkipGNN achieves superior and robust performance. Furthermore, we show that unlike popular GNNs, SkipGNN learns biologically meaningful embeddings and performs especially well on noisy, incomplete interaction networks.
more » « less
Full Text Available

« Prev Next »

Search for: All records